Abstract: We extend the study by Ornstein and Weiss on the 
asymptotic behavior of the normalized version of recurrence 
times and establish the large deviation property for a certain class 
of mixing processes. Further, an estimator for entropy based on 
recurrence times is proposed for which large deviation behavior 
is proved for stationary and ergodic sources satisfying similar 
mixing conditions. 

I. Introduction 

For a stationary and ergodic source with finite alphabet, the
asymptotic relationship between the probability of an n-length
sequence and entropy is well established by the
Shannon-McMillan-Breiman Theorem [16]. Later, Ornstein and Weiss [15]
established a similar expression relating recurrence times to
entropy. Kontoyiannis [12] related recurrence times and the
probability of an n-length sequence for Markov sources by showing
that $\log[R_n(X)P(X_1^n)] = o(n^{\beta})$ a.s. as $n \to \infty$,
for any $\beta > 0$. Here, $R_n(X)$ and $P(X_1^n)$ represent random
variables for the recurrence time and the probability of an n-length
block generated by a source X, respectively. Further, in
[12, Corollary 2], he also identified a class of processes for which
the central limit theorem (CLT) and the law of iterated logarithm
(LIL) hold true for recurrence times.

The question of large deviations for the Shannon-McMillan-Breiman
Theorem has been successfully answered in the literature under
certain mixing conditions [16]. Motivated by Kontoyiannis' results
and the fact that the Shannon-McMillan-Breiman Theorem satisfies the
large deviation property, it is natural to ask under what conditions
the asymptotic recurrence times relation satisfies the large
deviation property. Chazottes and Ugalde [6] in 2005 established
partial large deviations results on recurrence times for Gibbsian
sources. In this paper, we identify a class of processes for which
the large deviation property holds for recurrence times.

For an i.i.d. source, the Shannon-McMillan-Breiman Theorem satisfies
the large deviation property by direct application of Cramer's
Theorem [8]. However, for the Ornstein and Weiss recurrence times
result, Cramer's Theorem is not applicable even for an i.i.d. source.
This makes the analysis of the large deviation property for
recurrence times non-trivial even in the i.i.d. case. Hence, in order
to answer the question of large deviations for recurrence times one
needs to look more closely into the recurrence time statistics.

Maurer [14] studied the behavior of recurrence time statistics under
the assumption of non-overlapping recurrence blocks for i.i.d.
sources. Later, Abadi and Galves [1] studied a similar
non-overlapping scenario for ψ-mixing processes and established an
exponential bound on the recurrence time distribution. Moreover, they
also brought out the contrast between the overlapping and
non-overlapping cases. In the context of overlapping $R_n(x)$, there
are several references that show convergence in distribution of
$R_n(x)P(x_1^n)$ to an exponentially distributed random variable for
a certain class of stationary and ergodic processes [2], [4], [7],
[9], [10]. Kim [11] also studied the behavior of the conditional
distribution of $R_n(X)P(X_1^n)$ given the n-length block
$X_1^n = x_1^n$ and established an exponential bound on its
distribution for two classes of sources: i) ψ-mixing, ii) φ-mixing
with summable coefficients. In this paper, our analysis uses this
exponential bound on the conditional distribution established by
Kim [11].

The rest of the paper is organized as follows. In section II 
we state preliminary results on recurrence time statistics and 
mixing processes. In section III, we state our main theorems 
for the large deviation property of recurrence times. In section 
IV, we give proofs of these theorems and their corollaries. 
In section V, we define an estimator for entropy based on 
recurrence times and prove large deviation property for it. In 
section VI, we present our conclusion. 

II. Preliminaries 

Let $\{X_n\}_{n=-\infty}^{\infty}$ be a stationary and ergodic
process defined on the space of infinite sequences
$(A_{-\infty}^{\infty}, \sigma, P)$. Here A is a finite alphabet,
$\sigma$ is the sigma field generated by finite dimensional cylinders
and P is the probability measure. For simplicity of notation, we will
use X for $\{X_n\}_{n=-\infty}^{\infty}$.
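X is called ψ-mixing if

$$\sup_{n}\ \sup_{A \in \mathcal{F}_{0}^{n},\ B \in \mathcal{F}_{n+l+1}^{\infty}} \frac{|P(A \cap B) - P(A)P(B)|}{P(A)P(B)} = \psi(l). \tag{1}$$

Here, $\psi(l)$ is a decreasing sequence converging to 0 and
$\mathcal{F}_i^j$ denotes the sigma algebra generated by
$X_i^j = X_i X_{i+1} \ldots X_j$, and it is called φ-mixing if

$$\sup_{n}\ \sup_{A \in \mathcal{F}_{0}^{n},\ B \in \mathcal{F}_{n+l+1}^{\infty}} \frac{|P(A \cap B) - P(A)P(B)|}{P(A)} = \phi(l). \tag{2}$$

Here, $\phi(l)$ is a decreasing sequence converging to 0.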

Let $\{x_n\}_{n=-\infty}^{\infty}$ denote a particular realization of
X. Now define the first return time (recurrence time) of $x_1^n$ to
be:

$$R_n(x) = \min\{j \ge 1 : x_1^n = x_{j+1}^{j+n}\}.$$

As a dual of the recurrence time $R_n(x)$, the match length $L_m(x)$
is defined as follows:

$$L_m(x) = \max\{j \ge 1 : x_1^j = x_{k+1}^{k+j} \text{ for some } k = 1, 2, \ldots, m\}.$$

Observation 1 [12]: $R_n(x) > m \Leftrightarrow L_m(x) < n$.
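
The two definitions and the duality in Observation 1 can be checked on a finite sample. The sketch below is only illustrative: the function names are hypothetical, indices are 0-based, both quantities are truncated to the observed sample, and an unobserved recurrence is treated as infinite.

```python
def recurrence_time(x, n):
    """Smallest j >= 1 with x[j:j+n] == x[:n], or None if no recurrence is observed."""
    target = x[:n]
    for j in range(1, len(x) - n + 1):
        if x[j:j + n] == target:
            return j
    return None


def match_length(x, m):
    """Largest j with x[:j] == x[k:k+j] for some k in 1..m (0 if nothing matches)."""
    best = 0
    for k in range(1, m + 1):
        j = 0
        # Extend the match between the prefix and the copy starting at offset k.
        while k + j < len(x) and x[j] == x[k + j]:
            j += 1
        best = max(best, j)
    return best


def duality_holds(x, n, m):
    """Check Observation 1 on the sample: R_n(x) > m  <=>  L_m(x) < n."""
    r = recurrence_time(x, n)
    r_exceeds_m = r is None or r > m  # unobserved recurrence counts as infinite
    return r_exceeds_m == (match_length(x, m) < n)
```

For the string "abcabcabd", the block "abc" first recurs after 3 shifts, while the longest prefix match within the first 3 shifts has length 5.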

On the asymptotic behavior of $R_n(x)$ and $L_m(x)$ the following
holds.

Ornstein and Weiss [15]

For X with entropy rate H(X), with probability 1,

$$\lim_{n \to \infty} \frac{\log R_n(X)}{n} = \lim_{m \to \infty} \frac{\log m}{L_m(X)} = H(X). \tag{3}$$

Further, in this paper unless stated otherwise, i) we use H to
represent the entropy rate of the source X, and ii) a φ-mixing
process is assumed to be φ-mixing in both forward and backward
directions.
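Kim's Theorem [11]

For X satisfying the ψ-mixing condition or the φ-mixing condition
with summable coefficients,

$$\xi_x e^{-\xi_x P(x_1^n) t}\left[1 - 2\sqrt{C_x}\,(\xi_x P(x_1^n) t \vee 1)\right] \le P(R_n(X) > t \mid X_1^n = x_1^n) \quad \forall\, t > 0,$$

$$P(R_n(X) > t \mid X_1^n = x_1^n) \le \xi_x e^{-\xi_x P(x_1^n) t}\left[1 + K(x,t) + 2C_x(\xi_x P(x_1^n) t \vee 1)\right] \quad \forall\, t \ge \rho_x, \tag{4}$$

where $C_x = C\{\inf_{n \le \Delta \le 1/P(x_1^n)} [\Delta P(x_1^n) + *(\Delta)]\}$
(C > 0 is a constant, * represents ψ or φ),

$$\rho_x = \frac{2\sqrt{C_x}}{(\sqrt{1 + C_x} + \sqrt{C_x})\,\xi_x P(x_1^n)}, \qquad K(x,t) = 2\sqrt{C_x}\,(t\xi_x P(x_1^n) \vee 1)(1 + C_x(t\xi_x P(x_1^n) \vee 1)),$$

and $C_x \to 0$ (as $n \to \infty$) and $\xi_x \in [E_1, E_2]$
($0 < E_1 < 1 < E_2 < \infty$). $a_1 \vee a_2$ means
$\max\{a_1, a_2\}$. The following additional properties, as listed in
[11] and originally proved in [7], [1], hold for φ-mixing processes: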

1) For an exponentially φ-mixing process, $\forall\, x_1^n \in A^n$,
there exist positive constants $D_0$ and $\Gamma$ s.t.
$\forall\, n > n_0$

$$C_x \le D_0 e^{-\Gamma n}. \tag{5}$$

2) Let $B_n(s)$ be the set of $x_1^n \in A^n$ such that
$R_n(x) < \frac{n}{s}$. Then, for any φ-mixing process, there exist
$s \in \mathcal{N}$ ($\mathcal{N}$ being the set of natural numbers)
and two positive constants $D_1$ and $d_1$ such that

$$P(\{x : x_1^n \in B_n(s)\}) \le D_1 e^{-d_1 n}. \tag{6}$$

3) For exponentially φ-mixing processes, for every
$x_1^n \in A^n \setminus B_n(s)$,

$$|\xi_x - 1| \le D_2 e^{-d_2 n} \quad (\text{for } n \text{ large enough}). \tag{7}$$

Here, $D_2$ and $d_2$ are constants.
Now, we state a lemma which is required in the proof of Theorem 4
stated in section III. Let $A_1$, $A_2$ and $A_3$ be three sets such
that $A_3 = A_1 \cap A_2$. Suppose $P(A_1) \ge 1 - p_1 e^{-p_2 n}$
and $P(A_2) \ge 1 - q_1 e^{-q_2 n}$, where $p_1$, $p_2$, $q_1$ and
$q_2$ are positive constants. Then, we have

Lemma 1: $P(A_3) \ge 1 - (p_1 + q_1)e^{-\min\{p_2, q_2\} n}$.

Lemma 1 is proved in the appendix.
Definition [16]

X is said to have exponential rates for entropy if for every
$\epsilon > 0$, we have

$$P(\{x_1^n : 2^{-n(H+\epsilon)} \le P(x_1^n) \le 2^{-n(H-\epsilon)}\}) \ge 1 - r(\epsilon, n), \tag{8}$$

where $-\frac{1}{n} \ln r(\epsilon, n)$ is bounded away from 0, or in
other words $r(\epsilon, n) = e^{-k(\epsilon) n}$, where
$k(\epsilon)$ is a real valued positive function of $\epsilon$.
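
As an illustration of this definition, the sketch below computes $r(\epsilon, n)$ exactly for an i.i.d. Bernoulli(p) source, where the probability of a block depends only on its number of ones. The source, the parameter values and the function names are assumptions chosen for the example, not part of the paper.

```python
import math


def atypical_probability(p, n, eps):
    """r(eps, n) for an i.i.d. Bernoulli(p) source: probability that an n-block
    lies outside {2^{-n(H+eps)} <= P(x_1^n) <= 2^{-n(H-eps)}}."""
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # entropy rate in bits
    typical = 0.0
    for k in range(n + 1):  # k = number of ones in the block
        per_symbol = -(k * math.log2(p) + (n - k) * math.log2(1 - p)) / n
        if abs(per_symbol - h) <= eps:
            # Binomial pmf evaluated in log space to avoid overflow/underflow.
            log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                       - math.lgamma(n - k + 1)
                       + k * math.log(p) + (n - k) * math.log(1 - p))
            typical += math.exp(log_pmf)
    return max(0.0, 1.0 - typical)


def empirical_rate(p, n, eps):
    """-(1/n) ln r(eps, n), the quantity that must stay bounded away from 0."""
    return -math.log(atypical_probability(p, n, eps)) / n
```

For example, with p = 0.3 and eps = 0.1 the value of `atypical_probability` shrinks quickly as n grows from 50 to 200 to 800, which is the behavior the definition asks for.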

Theorem 1 [16]

1) I.I.D., ergodic Markov and ψ-mixing processes all have
exponential rates for entropy.

2) An aperiodic and irreducible Markov chain is ψ-mixing.
Remark 1: In [16], the ψ-mixing condition used is weaker than the one
we have defined in Eq. (1). So, Theorem 1 also holds for processes
satisfying the stronger ψ-mixing condition given in Eq. (1).

Theorem 2 [5]

1) If a process is ψ-mixing, then it is also φ-mixing.

2) If a Markov chain is φ-mixing then it is exponentially
φ-mixing.

Corollary 1: From Theorems 1 and 2, it follows that an aperiodic and
irreducible Markov chain is exponentially φ-mixing and has
exponential rates for entropy.
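
To make the geometric decay in Corollary 1 concrete, the sketch below tracks $\max_{i,j} |P^l(i,j) - \pi(j)|$ for a small chain. This quantity is used here only as a simple stand-in for the φ-mixing coefficient, which is an assumption of the example; the two-state transition matrix is hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mixing_proxy(p, pi, gap):
    """max over i, j of |P^gap(i, j) - pi(j)|: distance of the gap-step
    transition law from the stationary law."""
    power = p
    for _ in range(gap - 1):
        power = mat_mul(power, p)
    size = len(p)
    return max(abs(power[i][j] - pi[j]) for i in range(size) for j in range(size))


# Hypothetical aperiodic, irreducible two-state chain (illustration only).
P = [[0.7, 0.3], [0.2, 0.8]]
PI = [0.4, 0.6]  # stationary law, satisfies PI P = PI
decay = [mixing_proxy(P, PI, gap) for gap in range(1, 8)]
ratios = [decay[i + 1] / decay[i] for i in range(len(decay) - 1)]
```

For this chain the successive ratios equal the second eigenvalue 1 - 0.3 - 0.2 = 0.5, so the proxy decays exponentially in the gap.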

III. Main Theorems 

Theorem 3: For a process satisfying the ψ-mixing condition or the
φ-mixing condition with summable coefficients and with exponential
rates for entropy,

$$P\left(\frac{\log R_n(X)}{n} > H + \epsilon\right) \le e^{-f(\epsilon) n} \quad \forall\, n \ge N(\epsilon),$$

where $f(\epsilon)$ is a real positive valued function for all
$\epsilon > 0$ and $f(0) = 0$.

Corollary 2: Under the conditions of Theorem 3, we have

$$P\left(\frac{\log m}{L_m(X)} > H + \epsilon\right) \le e^{-f(\epsilon) \frac{\log m}{H + \epsilon}} \quad \forall\, m \ge M(\epsilon).$$

Theorem 4: For an exponentially φ-mixing process,

$$P\left(\frac{\log R_n(X)}{n} < H - \epsilon\right) \le e^{-g(\epsilon) n} \quad \forall\, n \ge N'(\epsilon),$$

where $g(\epsilon)$ is a real positive valued function for all
$\epsilon > 0$ and $g(0) = 0$.

Corollary 3: Under the conditions of Theorem 4, we have

$$P\left(\frac{\log m}{L_m(X)} < H - \epsilon\right) \le e^{-g(\epsilon) \frac{\log m}{H - \epsilon}} \quad \forall\, m \ge M'(\epsilon).$$

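Theorems 3 and 4 are combined in the form of

Theorem 5 (Large Deviation Property for Recurrence Times)

For an exponentially φ-mixing process with exponential rates for
entropy,

$$P\left(\left|\frac{\log R_n(X)}{n} - H\right| > \epsilon\right) \le 2e^{-I(\epsilon) n} \quad \forall\, n \ge N''(\epsilon),$$

where $I(\epsilon) = \min\{f(\epsilon), g(\epsilon)\}$ and
$N''(\epsilon) = \max\{N(\epsilon), N'(\epsilon)\}$.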

Remark 2: From Corollary 1, it can be inferred that the quantity
$\frac{\log R_n(X)}{n}$ for an aperiodic and irreducible Markov chain
satisfies the large deviation property.

IV. Proofs 

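Proof of Theorem 3:

Let $A_n^{(\delta)}$ be the set of n-long sequences defined as

$$A_n^{(\delta)} = \{y \in A^n : 2^{-n(H+\delta)} \le P(y) \le 2^{-n(H-\delta)}\}.$$

Now,

$$\begin{aligned}
P\left(\frac{\log R_n(X)}{n} > H + \epsilon\right) &= P(R_n(X) > 2^{n(H+\epsilon)}) \\
&= \sum_{y \in A^n} P(y) P(R_n(X) > 2^{n(H+\epsilon)} \mid X_1^n = y) \\
&= \sum_{y \in A_n^{(\delta)}} P(y) P(R_n(X) > 2^{n(H+\epsilon)} \mid X_1^n = y) + \sum_{y \notin A_n^{(\delta)}} P(y) P(R_n(X) > 2^{n(H+\epsilon)} \mid X_1^n = y) \\
&\overset{(a)}{\le} \sum_{y \in A_n^{(\delta)}} P(y)\, \xi_y e^{-\xi_y 2^{n(H+\epsilon)} P(y)} \left[1 + K(y, 2^{n(H+\epsilon)}) + 2C_y(\xi_y 2^{n(H+\epsilon)} P(y) \vee 1)\right] + \sum_{y \notin A_n^{(\delta)}} P(y) \\
&\overset{(b)}{\le} \sum_{y \in A_n^{(\delta)}} P(y)\, E_2 e^{-E_1 2^{n(H+\epsilon)} P(y)} \left[1 + K'(y, 2^{n(H+\epsilon)}) + 2d(E_2 2^{n(H+\epsilon)} P(y) \vee 1)\right] + \sum_{y \notin A_n^{(\delta)}} P(y),
\end{aligned} \tag{9}$$

where

$$K'(y, 2^{n(H+\epsilon)}) = 2\sqrt{d}\,(2^{n(H+\epsilon)} E_2 P(y) \vee 1)(1 + d(2^{n(H+\epsilon)} E_2 P(y) \vee 1)).$$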

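(a) follows from the use of inequality (4) and Remark 6 as stated in
the Appendix. (b) follows from using the fact that
$\xi_y \in [E_1, E_2]$ and $C_y \to 0$ as $n \to \infty$, so that
$C_y \le d$ $\forall\, y$ and n large enough, where $d > 0$ is an
arbitrary constant. For $y \in A_n^{(\delta)}$, we have

$$2^{n(\epsilon - \delta)} \le 2^{n(H+\epsilon)} P(y) \le 2^{n(\epsilon + \delta)}.$$

For every $\epsilon > 0$, choose $\delta = \frac{\epsilon}{2}$.
Consequently, we have

$$2^{\frac{n\epsilon}{2}} \le 2^{n(H+\epsilon)} P(y) \le 2^{\frac{3n\epsilon}{2}} \quad \forall\, y \in A_n^{(\epsilon/2)}. \tag{10}$$

Also, $2^{\frac{n\epsilon}{2}} E_2 > 1$ since $E_2 > 1$. Hence, using
(9) and (10), we have

$$P\left(\frac{\log R_n(X)}{n} > H + \epsilon\right) \le \sum_{y \in A_n^{(\epsilon/2)}} P(y)\, E_2 e^{-E_1 2^{\frac{n\epsilon}{2}}} \left[1 + 2\sqrt{d}\, E_2 2^{\frac{3n\epsilon}{2}} (1 + dE_2 2^{\frac{3n\epsilon}{2}}) + 2dE_2 2^{\frac{3n\epsilon}{2}}\right] + \sum_{y \notin A_n^{(\epsilon/2)}} P(y). \tag{11}$$

Using (8) and (11), for processes having exponential rates for
entropy and satisfying the ψ-mixing condition or the φ-mixing
condition with summable coefficients, we have

$$P\left(\frac{\log R_n(X)}{n} > H + \epsilon\right) \le E_2 e^{-E_1 2^{\frac{n\epsilon}{2}}} \left[1 + 2dE_2 2^{\frac{3n\epsilon}{2}} + 2\sqrt{d}\, E_2 2^{\frac{3n\epsilon}{2}} (1 + dE_2 2^{\frac{3n\epsilon}{2}})\right] + r\left(\frac{\epsilon}{2}, n\right) \le e^{-f(\epsilon) n}. \tag{12}$$

This completes the proof of Theorem 3.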
Remark 3: Since the first term on the right hand side of inequality
(12) stated above rapidly (super exponentially) converges to 0,
$f(\epsilon)$ behaves in a similar manner as
$k(\delta) = k(\frac{\epsilon}{2})$. (Also see Remark 7 as stated in
the Appendix.)
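Proof of Corollary 2: From Observation 1, we have

$$R_n(x) > 2^{n(H+\epsilon)} \Leftrightarrow L_{2^{n(H+\epsilon)}}(x) < n$$

$$\Rightarrow P(L_{2^{n(H+\epsilon)}}(X) < n) = P(R_n(X) > 2^{n(H+\epsilon)}) \le e^{-f(\epsilon) n}$$

$\forall\, n \ge N(\epsilon)$. Now, letting $m = 2^{n(H+\epsilon)}$,
we have

$$P\left(L_m(X) < \frac{\log m}{H + \epsilon}\right) \le e^{-f(\epsilon) \frac{\log m}{H+\epsilon}} \quad \forall\, m \ge M(\epsilon)$$

$$\Rightarrow P\left(\frac{\log m}{L_m(X)} > H + \epsilon\right) \le e^{-f(\epsilon) \frac{\log m}{H+\epsilon}}.$$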



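Proof of Theorem 4: Let $A_n^{(\delta)}$ be the same set as
considered in the proof of Theorem 3. For each
$y \in A_n^{(\epsilon/2)}$, we have

$$2^{-\frac{3n\epsilon}{2}} \le P(y) 2^{n(H-\epsilon)} \le 2^{-\frac{n\epsilon}{2}}. \tag{13}$$

Now,

$$\begin{aligned}
P\left(\frac{\log R_n(X)}{n} < H - \epsilon\right) &= 1 - P\left(\frac{\log R_n(X)}{n} \ge H - \epsilon\right) \\
&\le 1 - P\left(\frac{\log R_n(X)}{n} > H - \epsilon\right) \\
&= 1 - P(R_n(X) > 2^{n(H-\epsilon)}) \\
&= 1 - \sum_{y \in A^n} P(y) P(R_n(X) > 2^{n(H-\epsilon)} \mid X_1^n = y) \\
&\overset{(a)}{\le} 1 - \sum_{y \in A^n} P(y) \left[\xi_y e^{-\xi_y P(y) 2^{n(H-\epsilon)}} \left(1 - 2\sqrt{C_y}\,(\xi_y P(y) 2^{n(H-\epsilon)} \vee 1)\right)\right] \\
&\overset{(b)}{\le} 1 - \sum_{y \in A_n^{(\epsilon/2)}} P(y) \left[\xi_y e^{-E_2 2^{-\frac{n\epsilon}{2}}} \left(1 - 2\sqrt{C_y}\,(E_2 2^{-\frac{n\epsilon}{2}} \vee 1)\right)\right] \\
&\overset{(c)}{=} 1 - \sum_{y \in A_n^{(\epsilon/2)}} P(y) \left[\xi_y e^{-E_2 2^{-\frac{n\epsilon}{2}}} (1 - 2\sqrt{C_y})\right].
\end{aligned} \tag{14}$$

Here, (a) follows from (4), and (b) follows from the fact that
$\xi_y \in [E_1, E_2]$ and inequality (13). Also in (b) the negative
term contributed by sequences outside the set $A_n^{(\epsilon/2)}$ is
ignored because we are looking at an upper bound. (c) follows because
eventually $E_2 2^{-\frac{n\epsilon}{2}} < 1$, since
$E_2 2^{-\frac{n\epsilon}{2}} \to 0$ (as $n \to \infty$).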

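To proceed further, we introduce the following notation: let
$A_1 = A^n \setminus B_n(s)$ and $A_2 = A_n^{(\epsilon/2)}$. From (6)
and (8), we have $P(A_1) \ge 1 - D_1 e^{-d_1 n}$ and
$P(A_2) \ge 1 - e^{-k(\frac{\epsilon}{2}) n}$ respectively for
processes with exponential rates for entropy. Let
$A_3 = A_1 \cap A_2$.

Therefore from (14), we have

$$\begin{aligned}
P\left(\frac{\log R_n(X)}{n} < H - \epsilon\right) &\le 1 - \sum_{y \in A_3} P(y) \left[\xi_y e^{-E_2 2^{-\frac{n\epsilon}{2}}} (1 - 2\sqrt{C_y})\right] - \sum_{y \in A_2 \setminus A_3} P(y) \left[\xi_y e^{-E_2 2^{-\frac{n\epsilon}{2}}} (1 - 2\sqrt{C_y})\right] \\
&\overset{(d)}{\le} 1 - \sum_{y \in A_3} P(y) (1 - D_2 e^{-d_2 n})\, e^{-E_2 2^{-\frac{n\epsilon}{2}}} (1 - 2\sqrt{D_0}\, e^{-\frac{\Gamma n}{2}}) \\
&= 1 - P(A_3) (1 - D_2 e^{-d_2 n})\, e^{-E_2 2^{-\frac{n\epsilon}{2}}} (1 - 2\sqrt{D_0}\, e^{-\frac{\Gamma n}{2}}) \\
&\overset{(e)}{\le} 1 - e^{-E_2 2^{-\frac{n\epsilon}{2}}} \left[(1 - (D_1 + 1) e^{-\min\{d_1, k(\frac{\epsilon}{2})\} n})(1 - D_2 e^{-d_2 n})(1 - 2\sqrt{D_0}\, e^{-\frac{\Gamma n}{2}})\right] \\
&\le 1 - e^{-E_2 2^{-\frac{n\epsilon}{2}}} (1 - C' e^{-u(\epsilon) n}).
\end{aligned} \tag{15}$$

Here, (d) follows from (7) and (5) and from ignoring the negative
contribution made by the sequences in the set $A_2 \setminus A_3$.
(e) follows from Lemma 1. $C' > 0$ (constant) and $u(\epsilon)$
(positive valued function $\forall\, \epsilon > 0$ and 0 if
$\epsilon = 0$) are obtained after simplification of (e). Now, using
the extended mean value theorem for the function $e^{-x}$,

$$e^{-E_2 2^{-\frac{n\epsilon}{2}}} = 1 - E_2 2^{-\frac{n\epsilon}{2}} + \frac{(E_2 2^{-\frac{n\epsilon}{2}})^2}{2} e^{-c}.$$

Here, $c \in (0, E_2 2^{-\frac{n\epsilon}{2}})$. Therefore, we have

$$e^{-E_2 2^{-\frac{n\epsilon}{2}}} \ge 1 - E_2 2^{-\frac{n\epsilon}{2}}. \tag{16}$$

Hence, using (16) in (15), we get

$$P\left(\frac{\log R_n(X)}{n} < H - \epsilon\right) \le 1 - (1 - E_2 2^{-\frac{n\epsilon}{2}})(1 - C' e^{-u(\epsilon) n}) \le e^{-g(\epsilon) n} \quad \forall\, n \ge N'(\epsilon), \tag{17}$$

where $g(\epsilon)$ is a positive valued function
$\forall\, \epsilon > 0$ and $g(0) = 0$. This completes the proof of
Theorem 4.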

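Proof of Corollary 3: Using Observation 1, we have

$$P(L_{2^{n(H-\epsilon)}}(X) \ge n) \le P(R_n(X) \le 2^{n(H-\epsilon)}) \le e^{-g(\epsilon) n}$$

$\forall\, n \ge N'(\epsilon)$. Now, letting $m = 2^{n(H-\epsilon)}$,
we have

$$P\left(L_m(X) \ge \frac{\log m}{H - \epsilon}\right) \le e^{-g(\epsilon) \frac{\log m}{H-\epsilon}} \quad \forall\, m \ge M'(\epsilon)$$

$$\Rightarrow P\left(\frac{\log m}{L_m(X)} \le H - \epsilon\right) \le e^{-g(\epsilon) \frac{\log m}{H-\epsilon}}.$$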

Remark 4: Note that in the first step in Eq. (14) we have a term
$1 - P(\frac{\log R_n(X)}{n} > H - \epsilon) = P(R_n(X) \le 2^{n(H-\epsilon)})$.
Further, in the proof of Theorem 4, the bound $e^{-g(\epsilon) n}$ is
obtained on this term. Hence, there is no ambiguity in using the
exponential bound obtained in Theorem 4 on
$P(R_n(X) \le 2^{n(H-\epsilon)})$.



V. Estimator for Entropy 

Motivated by experimental results on estimators based on match
lengths given in [13], we propose an estimator based on recurrence
times as given below:
Estimator: Consider $R_n^{(i)} = R_n(T^i X)$.
Define: $J_n(X) = \frac{1}{Q(n)} \sum_{i=1}^{Q(n)} \frac{\log R_n^{(i)}}{n}$.
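
A minimal sketch of this estimator on a finite sample is given below. It assumes that T is the left shift, that logarithms are taken to base 2, and that the sample is long enough for every required recurrence to be observed; the function names are hypothetical.

```python
import math


def recurrence_time(x, n):
    """Smallest j >= 1 with x[j:j+n] == x[:n], or None if not observed in the sample."""
    target = x[:n]
    for j in range(1, len(x) - n + 1):
        if x[j:j + n] == target:
            return j
    return None


def entropy_estimate(x, n, q):
    """J_n = (1/q) * sum over i = 1..q of log2(R_n(T^i x)) / n, with T the left shift."""
    total = 0.0
    for i in range(1, q + 1):
        r = recurrence_time(x[i:], n)  # recurrence time of the block starting at shift i
        if r is None:
            raise ValueError("sample too short to observe every recurrence")
        total += math.log2(r) / n
    return total / q
```

On the periodic sequence "abab..." every block recurs after 2 shifts, so the estimate equals 1/n and tends to the true entropy rate 0 as n grows.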

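Proposition 1: If Q(n) is of polynomial order, then for processes
which are exponentially φ-mixing and have exponential rates for
entropy, $\lim_{n \to \infty} J_n(X) = H$ a.s., with $J_n(X)$
satisfying the large deviation property. The proof of the proposition
is given below:

$$\begin{aligned}
P(J_n(X) > H + \epsilon) &= P\left(\frac{1}{Q(n)} \sum_{i=1}^{Q(n)} \frac{\log R_n^{(i)}}{n} > H + \epsilon\right) \\
&\overset{(a)}{\le} \sum_{i=1}^{Q(n)} P\left(\frac{\log R_n^{(i)}}{n} > H + \epsilon\right) \\
&\overset{(b)}{\le} \sum_{i=1}^{Q(n)} e^{-f(\epsilon) n} \quad \forall\, n \ge N(\epsilon) \\
&= Q(n) e^{-f(\epsilon) n}.
\end{aligned} \tag{18}$$

Here, step (a) follows from Remark 8 given in the appendix and step
(b) follows from the stationarity of the source X and Theorem 3.
Similarly,

$$\begin{aligned}
P(J_n(X) < H - \epsilon) &= P\left(\frac{1}{Q(n)} \sum_{i=1}^{Q(n)} \frac{\log R_n^{(i)}}{n} < H - \epsilon\right) \\
&\overset{(a)}{\le} \sum_{i=1}^{Q(n)} P\left(\frac{\log R_n^{(i)}}{n} < H - \epsilon\right) \\
&\overset{(b)}{\le} \sum_{i=1}^{Q(n)} e^{-g(\epsilon) n} \quad \forall\, n \ge N'(\epsilon) \\
&= Q(n) e^{-g(\epsilon) n}.
\end{aligned} \tag{19}$$

Here, step (a) follows from Remark 8 and step (b) follows from the
stationarity of the source X and Theorem 4. Therefore, combining (18)
and (19), we have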

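$$P(|J_n(X) - H| > \epsilon) \le 2Q(n) e^{-I(\epsilon) n} \quad \forall\, n \ge N''(\epsilon), \tag{20}$$

where $N''(\epsilon) = \max\{N(\epsilon), N'(\epsilon)\}$.
For Q(n) of polynomial order, we have

$$\begin{aligned}
\sum_{n=1}^{\infty} P(|J_n(X) - H| > \epsilon) &\le \sum_{n=1}^{N''(\epsilon) - 1} P(|J_n(X) - H| > \epsilon) + \sum_{n=N''(\epsilon)}^{\infty} 2Q(n) e^{-I(\epsilon) n} \\
&\le N''(\epsilon) + \sum_{n=N''(\epsilon)}^{\infty} 2Q(n) e^{-I(\epsilon) n} \\
&< \infty.
\end{aligned} \tag{21}$$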
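
The finiteness of the last sum is what drives the Borel-Cantelli argument. As a numerical illustration only, the sketch below evaluates the partial sums for the hypothetical choices $Q(n) = n^2$ and rate 0.1, and compares them with the closed form $\sum_{n \ge 1} n^2 r^n = r(1+r)/(1-r)^3$ for $r = e^{-0.1}$.

```python
import math


def tail_bound_sum(q, rate, n_start, n_end):
    """Partial sum of 2 * q(n) * exp(-rate * n) for n = n_start, ..., n_end."""
    return sum(2.0 * q(n) * math.exp(-rate * n) for n in range(n_start, n_end + 1))


# Hypothetical choices: polynomial Q(n) = n^2 and exponent I(eps) = 0.1.
rate = 0.1
r = math.exp(-rate)
partial = tail_bound_sum(lambda n: n ** 2, rate, 1, 2000)
closed_form = 2.0 * r * (1 + r) / (1 - r) ** 3  # value of the full infinite series
```

The partial sum has already stabilized at the closed-form value, showing that a polynomial factor Q(n) does not destroy summability of an exponentially decaying bound.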



Hence, by the Borel-Cantelli Lemma,

$$\lim_{n \to \infty} J_n(X) = H \quad \text{a.s.} \tag{22}$$



Remark 5: The bounds we establish on convergence rates are loose; we
conjecture that our proposed estimator will converge to the entropy
rate at a faster rate than $2e^{-I(\epsilon) n}$.

VI. Conclusion 

In this paper, we have proved the large deviation property for the
normalized version of recurrence times for exponentially φ-mixing
processes. Further, we have also shown this property to hold for our
proposed estimator of entropy based on recurrence times. As future
work, it will be interesting to determine whether there are faster
rate functions than $f(\epsilon)$ and $g(\epsilon)$ in this context,
and for which further classes of processes the large deviation
property holds for the normalized version of recurrence times. Also,
one can conduct experimental or theoretical studies comparing the
convergence rates of the estimator based on match length given in
[13] and that based on recurrence times proposed in this paper.

Appendix 

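Remark 6: Note that in step (a), inequality (4) has been used;
however, it is important to check that it can be applied. This is
verified below:

$$\rho_y = \frac{2\sqrt{C_y}}{(\sqrt{1 + C_y} + \sqrt{C_y})\,\xi_y P(y)} \quad \forall\, y \in A^n.$$

Using lower bounds on $P(y)$ and $\xi_y$, we have

$$\rho_y \le \frac{2\sqrt{C_y}\; 2^{n(H+\delta)}}{(\sqrt{1 + C_y} + \sqrt{C_y})\, E_1} \quad \forall\, y \in A_n^{(\delta)}.$$

Since $C_y \to 0$ (as $n \to \infty$), for a given $d' > 0$,
$\frac{2\sqrt{C_y}}{\sqrt{1 + C_y} + \sqrt{C_y}} \le d'$ for n large
enough. Now, we choose $d'$ such that $0 < d' < E_1$. Since
eventually $\delta$ is chosen to be less than $\epsilon$, we have

$$\rho_y \le 2^{n(H+\delta)} \le 2^{n(H+\epsilon)} \quad \forall\, y \in A_n^{(\delta)}.$$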



Remark 7: Note that, though we prove Theorem 3 under the restriction
of certain mixing conditions and using inequality (4), it can also be
proved using the Markov inequality and Kac's Lemma under no
restriction of mixing. However, the super exponential behavior shown
by the first term in the proof of Theorem 3 (see inequality (12)) is
not evident from this alternative proof for the mixing sources
considered. Due to space limitations, we have omitted this proof.
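Remark 8: Let $Z_1, Z_2, \ldots, Z_m$ be m real valued random
variables. Consider the following probability,
$P(\frac{1}{m} \sum_{i=1}^{m} Z_i > r)$, and set
$E_i = \{\omega : Z_i(\omega) > r\}$. Now,

$$\begin{aligned}
P\left(\frac{1}{m} \sum_{i=1}^{m} Z_i > r\right) &\le P\left(\bigcup_{i=1}^{m} E_i\right) \\
&\le \sum_{i=1}^{m} P(E_i) \quad \text{(Union Bound)} \\
&= \sum_{i=1}^{m} P(Z_i > r).
\end{aligned}$$

Similarly, by replacing the '>' sign with '<' accordingly, it can be
proved that

$$P\left(\frac{1}{m} \sum_{i=1}^{m} Z_i < r\right) \le \sum_{i=1}^{m} P(Z_i < r).$$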
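
The inequality of Remark 8 also holds for empirical frequencies on any data set, because every sample whose average exceeds r has at least one coordinate exceeding r. The sketch below checks this on simulated data; the Gaussian samples, the seed and the function name are arbitrary choices made for illustration.

```python
import random


def empirical_union_bound(samples, r):
    """Return (lhs, rhs): the empirical frequency of {mean(Z) > r} and the sum
    over i of the empirical frequencies of {Z_i > r}, for rows Z = (Z_1..Z_m)."""
    count = len(samples)
    m = len(samples[0])
    lhs = sum(1 for z in samples if sum(z) / m > r) / count
    rhs = sum(sum(1 for z in samples if z[i] > r) for i in range(m)) / count
    return lhs, rhs


random.seed(0)  # arbitrary seed; the inequality does not depend on it
data = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(1000)]
lhs, rhs = empirical_union_bound(data, 0.5)
```

The gap between `lhs` and `rhs` gives a feel for how loose the union bound can be, which is consistent with the comment in Remark 5 that the resulting rates are not tight.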

Proof of Lemma 1:

$$\begin{aligned}
P(A_1 \cup A_2) \le 1 &\Rightarrow P(A_1) + P(A_2) - P(A_1 \cap A_2) \le 1 \\
&\Rightarrow P(A_1 \cap A_2) \ge P(A_1) + P(A_2) - 1 \\
&\Rightarrow P(A_1 \cap A_2) \ge 1 - p_1 e^{-p_2 n} + 1 - q_1 e^{-q_2 n} - 1 \\
&\Rightarrow P(A_1 \cap A_2) \ge 1 - (p_1 e^{-p_2 n} + q_1 e^{-q_2 n}) \\
&\Rightarrow P(A_1 \cap A_2) \ge 1 - (p_1 + q_1) e^{-\min\{p_2, q_2\} n}.
\end{aligned}$$